Adding Glottal Source Information to Intra-Lingual Voice Conversion

Authors

  • Javier Pérez
  • Antonio Bonafonte
Abstract

This paper studies the inclusion of glottal source characteristics in voice conversion (VC) systems. We use source/filter decomposition to parametrize the vocal tract using line spectral frequencies (LSF), the glottal source using the Liljencrants-Fant (LF) model, and the aspiration noise using amplitude-modulated, high-pass-filtered additive white Gaussian noise (AWGN). To evaluate the impact of this new parametrization in VC, we use a reference conversion system that estimates a linear transformation function using a joint target/source model obtained with CART and GMM. The reference system is based on the LPC model, uses LSF to represent the vocal tract and a selection technique for the residual. We use the reference algorithm to build a VC system for each of the three parameter sets (vocal tract, glottal source, and aspiration noise). We compare the reference and the new parametrization in the framework of an intra-lingual voice conversion task in Spanish. The results show that the new source/filter representation clearly improves the overall performance, both in terms of speaker identity transformation and voice quality.
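The aspiration-noise component named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aspiration_noise`, the first-order high-pass filter, the raised-cosine envelope, and every default value (`fs`, `f0`, `hp_coeff`, `mod_depth`, `gain`) are assumptions chosen only to show the three steps of white Gaussian noise, high-pass filtering, and pitch-synchronous amplitude modulation.

```python
import math
import random


def aspiration_noise(n_samples, fs=16000, f0=100.0, hp_coeff=0.95,
                     mod_depth=0.5, gain=0.01, seed=0):
    """Return amplitude-modulated, high-pass-filtered white Gaussian noise.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Step 1: white Gaussian noise with zero mean and unit variance.
    white = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    # Step 2: first-order high-pass filter,
    # y[n] = a * (y[n-1] + x[n] - x[n-1]), which has zero gain at DC.
    filtered = []
    prev_x = 0.0
    prev_y = 0.0
    for x in white:
        y = hp_coeff * (prev_y + x - prev_x)
        filtered.append(y)
        prev_x, prev_y = x, y

    # Step 3: pitch-synchronous amplitude modulation with a raised-cosine
    # envelope that completes one cycle per fundamental period (fs / f0).
    out = []
    for n, y in enumerate(filtered):
        phase = 2.0 * math.pi * f0 * n / fs
        envelope = 1.0 - mod_depth * 0.5 * (1.0 + math.cos(phase))
        out.append(gain * envelope * y)
    return out
```

Calling `aspiration_noise(1600)` yields 0.1 s of noise at the assumed 16 kHz sampling rate. In a full source/filter synthesizer this signal would be added to the LF glottal waveform before vocal-tract filtering; the envelope shape and its alignment with the glottal cycle would need to be estimated from data rather than fixed as here.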

Similar resources

The linear transformation of LF glottal waveforms for voice conversion

Most Voice Conversion (VC) systems exploit source-filter decomposition based on linear prediction (LP) to transform spectral envelopes, incurring as a result various issues related to the oversimplification of the LP voice source model. Whilst residual prediction methods can mitigate this problem, they cannot be used to modify voice source quality. In this paper, a system which employs linear t...

Intra-Lingual and Cross-Lingual Prosody Modelling

Statistical Parametric Speech Synthesis (SPSS) offers flexibility and computational advantage compared to other methods for Text-to-Speech Synthesis. While the speech output is intelligible, statistically trained voices are less natural due to the amount of signal processing and statistical averaging that goes into building the models. Much of the blame for the lack of naturalness falls on the ...

Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression

Voice conversion aims at converting speech from one speaker to sound as if it were spoken by another specific speaker. The most popular voice conversion approach, based on Gaussian mixture modeling, tends to suffer from either model overfitting or oversmoothing. To overcome the shortcomings of the traditional approach, we recently proposed to use dynamic kernel partial least squares (DKPLS) regres...

Voice Conversion of Non-aligned Data using Unit Selection

Voice conversion (VC) technology makes it possible to transform the voice of a source speaker so that it is perceived as the voice of a target speaker. One application of VC is speech-to-speech translation, where the voice has to convey not only what is said but also who the speaker is. This paper introduces the different methods submitted by UPC to the TC-STAR second evaluation cam...

A flexible and modular crosslingual voice conversion system

A cross-lingual voice conversion system aims at modifying the timbral structure of recorded sentences from a source speaker, in order to obtain processed sentences which are perceived as the same sentences uttered by a target speaker. This work presents the cross-lingual voice conversion problem as a network of related sub-problems and discusses several techniques for solving each of these sub-pr...

Publication date: 2011